Causal relationship and shared genes between air pollutants and amyotrophic lateral sclerosis: A large‐scale genetic analysis

Abstract

Objective: Air pollutants have been reported to have a potential relationship with amyotrophic lateral sclerosis (ALS). Despite several observational studies, the causality and underlying mechanism remain unknown. We aimed to investigate the potential causality between air pollutants (PM2.5, NOx, and NO2) and the risk of ALS and to elucidate the mechanisms underlying this relationship.

Methods: The data used in our study were obtained from publicly available genome-wide association study data sets, in which single nucleotide polymorphisms (SNPs) were employed as the instrumental variants, selected according to three principles. Two-sample Mendelian randomization (MR) and transcriptome-wide association (TWAS) analyses were conducted to evaluate the effects of air pollutants on ALS and to identify genes associated with both pollutants and ALS, followed by regulatory network prediction.

Results: In the MR analysis, genetically predicted exposure to a high level of PM2.5 (OR: 2.40 [95% CI: 1.26-4.57], p = 7.46E-3) and NOx (OR: 2.35 [95% CI: 1.32-4.17], p = 3.65E-3) increased the risk of ALS, while the effect of NO2 showed a similar trend but did not reach statistical significance. In the TWAS analysis, TMEM175 and USP35 were the genes shared between PM2.5 and ALS with the same direction of effect.

Conclusion: Higher exposure to PM2.5 and NOx might causally increase the risk of ALS. Avoiding exposure to air pollutants and air cleaning might be necessary for ALS prevention.


| INTRODUCTION
Amyotrophic lateral sclerosis (ALS) is a rare but fatal neurodegenerative disease with an annual incidence of 1-2.6 per 100,000 persons [1,2]. The lifetime risk of ALS is estimated to be 1 in 400, and less than 10% of patients survive beyond 10 years [3,4][6][7]. In ALS, available treatments only prolong life expectancy and maximize quality of life. Therefore, there is an urgent need to prevent and manage this devastating disease. ALS can be classified into two main categories: sporadic ALS (sALS) and familial ALS (fALS). The majority of ALS cases are sporadic, meaning they occur without a clear family history, while about 10% of cases are fALS [8]. Although studies of genetic variation in fALS have helped make significant strides in uncovering the underlying mechanism [8,9], most cases of ALS are sporadic with no clear cause and may be triggered by a combination of genetic predisposition, environmental exposure, and the passage of time. This widely accepted view is known as the gene-time-environment hypothesis.
Research into the environmental exposome has shed light on factors potentially associated with ALS. A meta-analysis has summarized the environmental risk factors for ALS, including exposure to heavy metals, organic chemicals, electric shocks, and physical injuries [10]. While this analysis did not explore the impact of air pollution on ALS, the significance of air pollutants is increasingly recognized in various diseases, including respiratory [11] and cardiovascular diseases [12], as well as neurological disorders [13,14]. Therefore, the connection between air pollution and ALS merits closer investigation in the context of these findings.
The causality between atmospheric pollution and ALS remains unknown despite multiple observational studies. For example, a study by Meinie et al. from the Netherlands involving 917 ALS patients and 2662 controls found a positive association between prolonged exposure to traffic-related air pollutants and a higher chance of developing ALS [15]. A recent Bayesian hierarchical analysis similarly reported a strong positive correlation between ALS and elemental carbon concentration [16]. However, these studies were limited by the inherent weaknesses of observational designs, such as confounding and reverse causation, which make it difficult to establish causality. There is therefore a pressing need to test whether the relationship between air pollution and ALS is causal.
To address this issue, we used two-sample Mendelian randomization (TSMR) to investigate the potential causality between air pollution and ALS. The fundamental principle of Mendelian randomization (MR) is to use instrumental variants (IVs) to make causal estimates. Three assumptions are required for MR: (i) IVs are strongly associated with the exposure; (ii) IVs are not associated with confounders of the exposure-outcome relationship; and (iii) IVs act on the outcome only via the exposure. MR is usually implemented using single nucleotide polymorphisms (SNPs) as IVs, which follow Mendel's laws of random assortment of genotypes and thereby mimic the design of a randomized controlled trial. Because genetic variants are presumed to be allocated randomly before birth, treating them as instrumental variables minimizes the potential influence of environmental factors. Moreover, as these genetic variants are established well before the onset of disease, the residual confounding and reverse causation commonly encountered in conventional observational studies are largely mitigated [17,18]. With genome-wide association studies (GWAS) providing summary statistics, MR has been extensively applied across diverse research domains [20-31]. Yi et al., for example, conducted a TSMR analysis reporting a causal link between air pollution and neurodegenerative disorders (Alzheimer's disease and Parkinson's disease) [24]. Wang et al. provided MR-based causal evidence that air pollution might cause multiple cancer types [20]. However, the causal relationship between air pollution and ALS and its underlying biological mechanisms remain largely unexplored. Herein, we performed TSMR with existing GWAS data to test the hypothesis that air pollution exposure is causally linked to ALS. Furthermore, we conducted a transcriptome-wide association (TWAS) analysis based on the MR results to explore possible mechanisms.

| METHODS
The flowchart of the study is shown in Figure 1.

| Data sources
The data used in our study were obtained from publicly available genome-wide association study (GWAS) data sets and therefore did not require ethical approval or informed consent. All included GWAS data sets consisted of participants of European ancestry (Table S1), with no restrictions on gender, income, or education.
GWAS of exposure to air pollutants (PM2.5, NOx, NO2) were obtained from the UK Biobank (www.ukbiobank.ac.uk) [32]. The level of air pollutants in the UK was estimated using a land-use regression model for the annual average of 2010. The mean PM2.5 concentration was 9.99 ± 1.06 μg/m3, ranging from 8.17 [...] The GWAS for NO2 and NOx included 456,380 individuals and 9,851,867 SNPs.
The GWAS for ALS was obtained from the latest and largest meta-analysis by van Rheenen et al. [33], which included 27,250 cases with familial or sporadic ALS and 110,881 control subjects. The participants of these GWAS were all of European descent, from European countries and the United States. The ALS cases in this large-scale meta-analysis were derived from independent cohorts and diagnosed by the El Escorial criteria.

| Selection of instrumental variants
Three principles were followed to select IVs in this study [34]. First, IVs were required to exhibit strong and independent correlations with the corresponding exposure. As few SNPs passed the genome-wide threshold of 5e-8, we set a relaxed but still stringent threshold of p < 1e-6 to identify SNPs strongly correlated with the exposures of interest, as in previous studies [17]. Next, we employed the PLINK clumping algorithm, with an LD threshold of <0.001 within a 10 MB distance from the index variant, to select independent IVs. Additionally, SNPs with F statistics <10 were excluded to guarantee the robustness of the IVs. Second, IVs were required to be unrelated to potential confounding factors such as body mass index (BMI), blood pressure, and smoking behavior. We conducted SNP lookups in the PhenoScanner database (http://phenoscanner.medschl.cam.ac.uk) to exclude any SNP with known associations with these confounding factors. Last, IVs were expected to be independent of the outcome and to exert their influence solely through the exposure. Thus, SNPs significantly correlated with the outcome were excluded.
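The first selection principle can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline used in the study: the `Snp` record, the `rs_demo_*` identifiers, and all numeric values are hypothetical, and the per-SNP F statistic is approximated as (beta/se)^2, which is a common convention but an assumption here because the text does not state the formula used. LD clumping and the PhenoScanner lookup are not reproduced.

```python
from dataclasses import dataclass


@dataclass
class Snp:
    rsid: str    # hypothetical identifier
    beta: float  # SNP-exposure effect estimate
    se: float    # standard error of beta
    pval: float  # SNP-exposure association p-value


def f_statistic(snp: Snp) -> float:
    """Per-SNP F statistic, approximated as (beta / se)^2 (assumed formula)."""
    return (snp.beta / snp.se) ** 2


def select_instruments(snps, p_threshold=1e-6, min_f=10.0):
    """Keep SNPs with p < p_threshold and drop those with F < min_f."""
    return [s for s in snps if s.pval < p_threshold and f_statistic(s) >= min_f]


# Toy values for illustration only; they are not taken from the study.
candidates = [
    Snp("rs_demo_1", beta=0.020, se=0.004, pval=5.7e-7),
    Snp("rs_demo_2", beta=0.010, se=0.004, pval=1.2e-2),
    Snp("rs_demo_3", beta=0.030, se=0.005, pval=2.0e-9),
]
selected = select_instruments(candidates)
print([s.rsid for s in selected])
```

In this toy input the second SNP fails the p-value threshold, so only the first and third would be carried forward to clumping and confounder screening.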
FIGURE 1. Flowchart and study design of TSMR and TWAS. SNPs from publicly available GWAS data sets were selected as IVs based on their strong correlation with the exposure and independence from confounding factors. These IVs were required to influence the outcome solely through exposure, ensuring the credibility of the MR analysis. Furthermore, the GWAS data were converted into TWAS format to identify gene transcripts associated with air pollutants and ALS. The figure was created with BioRender.com. GWAS, genome-wide association study; IVs, instrumental variants; SNP, single nucleotide polymorphism; TSMR, two-sample Mendelian randomization; TWAS, transcriptome-wide association.

| Two-sample Mendelian randomization (TSMR)
For the TSMR, random-effects inverse variance weighting (IVW) was used as the primary method [76]. IVW entails a weighted regression of the IV-outcome effects on the IV-exposure effects with the intercept constrained to zero, thus offering optimal statistical power. However, in the presence of horizontal pleiotropy, the outcome could be influenced by causal pathways other than the exposure itself [35][78][79]. Weighted median, MR-Egger, and MR-PRESSO were therefore used as supplementary methods. The weighted median approach takes the median of the MR estimates for causal estimation, while MR-Egger regression estimates the intercept as a measure of average pleiotropy. MR-PRESSO identifies potentially pleiotropic IVs and re-estimates the effect after excluding these outliers [37].

TSMR analysis was performed to assess the effects of air pollutants on ALS. To account for multiple testing, a p-value below the Bonferroni-corrected threshold of 1.67E-2 (0.05/3) was deemed statistically significant [80]. Sensitivity analyses, including tests for heterogeneity and horizontal pleiotropy, were conducted to evaluate the robustness of the findings. Heterogeneity was assessed using Cochran's Q test, while horizontal pleiotropy was examined through MR-PRESSO and the MR-Egger intercept test [34,81]. Although based on different assumptions, these tests fundamentally measure the extent to which the impact of one or more instrumental SNPs is exaggerated because it acts not only through the hypothesized pathway but also through other unaccounted-for causal pathways.
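To make the IVW estimate and Cochran's Q concrete, the following Python sketch implements one common formulation: per-SNP Wald ratios weighted by bx^2/se_y^2, with a multiplicative random-effects inflation of the standard error. It is an illustration under stated assumptions rather than the study's implementation; the function name `ivw`, the argument names, and the toy effect sizes are hypothetical, and the normal-approximation p-value is a simplification.

```python
import math


def ivw(bx, by, se_y):
    """IVW estimate with multiplicative random effects (one common formulation).

    bx: SNP-exposure effects, by: SNP-outcome effects, se_y: SEs of by.
    """
    weights = [(x / s) ** 2 for x, s in zip(bx, se_y)]  # bx^2 / se_y^2
    ratios = [y / x for x, y in zip(bx, by)]            # per-SNP Wald ratios
    total_w = sum(weights)
    beta = sum(w * r for w, r in zip(weights, ratios)) / total_w
    # Cochran's Q: weighted squared deviation of each ratio from the pooled estimate
    q = sum(w * (r - beta) ** 2 for w, r in zip(weights, ratios))
    dof = len(bx) - 1
    inflation = max(1.0, q / dof)                       # never shrink the SE
    se = math.sqrt(inflation / total_w)
    p = math.erfc(abs(beta / se) / math.sqrt(2))        # two-sided normal p-value
    return {"beta": beta, "se": se, "or": math.exp(beta), "q": q, "dof": dof, "p": p}


# Toy summary statistics for illustration only; not taken from the study.
bx = [0.020, 0.030, 0.025, 0.018]
by = [0.018, 0.024, 0.026, 0.015]
se_y = [0.008, 0.009, 0.010, 0.007]

result = ivw(bx, by, se_y)
bonferroni_alpha = 0.05 / 3  # three exposures: PM2.5, NOx, NO2
print(result, result["p"] < bonferroni_alpha)
```

Under the null of no heterogeneity, Q is compared with a chi-square distribution with `dof` degrees of freedom; that tail probability is omitted here to keep the sketch within the standard library.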
All statistical analyses were performed using R software. The "TwoSampleMR" package in R was used for data extraction, SNP clumping, harmonization, and TSMR [82].

| Transcriptome-wide association (TWAS) analysis and joint/conditional tests
To conduct transcriptomic imputation, we employed the FUSION method [38], which involves converting GWAS data into TWAS format. In this approach, a linear model based on expression quantitative trait loci was used to predict gene expression levels, with brain RNA-seq data as reference panels: Genotype-Tissue Expression version 8 (GTEx v8; N = 183) [39] and the CommonMind Consortium expression (N = 452) and splicing (N = 452) references [40]. Genes that exhibited significant associations with ALS were selected first. Then, among these ALS-associated genes, those that also showed significant associations with air pollutants were identified as potential mediators of the effect of air pollution on ALS. Bonferroni correction was applied to account for multiple TWAS tests. A TWAS p-value below 0.05 but above the Bonferroni-corrected threshold was deemed a suggestive association.
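The two-stage gene filter described above can be sketched as follows. This is a hedged illustration, not the FUSION workflow itself: the function names, the `GENE_*` labels, and the z-scores and p-values are hypothetical. The panel size of 9130 genes is the figure reported in the Results, and dividing the second-stage threshold by the number of genes that pass the first stage is an assumption based on how the Results report the correction.

```python
def bonferroni(alpha: float, n_tests: int) -> float:
    """Bonferroni-corrected significance threshold."""
    return alpha / n_tests


def shared_genes(als, pollutant, n_panel_genes, alpha=0.05):
    """Two-stage filter on TWAS results given as gene -> (z_score, p_value).

    Stage 1 keeps genes significant for ALS after Bonferroni correction whose
    effect direction matches the pollutant. Stage 2 labels each kept gene by
    its pollutant p-value as 'significant' or 'suggestive'.
    """
    als_cut = bonferroni(alpha, n_panel_genes)
    stage1 = sorted(
        g for g, (z, p) in als.items()
        if p < als_cut and g in pollutant and z * pollutant[g][0] > 0
    )
    if not stage1:
        return {}
    pollutant_cut = bonferroni(alpha, len(stage1))
    labels = {}
    for g in stage1:
        p = pollutant[g][1]
        if p < pollutant_cut:
            labels[g] = "significant"
        elif p < alpha:
            labels[g] = "suggestive"
    return labels


# Toy TWAS results for illustration only; gene names and values are hypothetical.
als = {"GENE_A": (5.1, 3e-7), "GENE_B": (-4.9, 9e-7),
       "GENE_C": (2.0, 4.5e-2), "GENE_D": (6.0, 2e-9)}
pm25 = {"GENE_A": (3.2, 1.4e-3), "GENE_B": (-2.1, 3.6e-2),
        "GENE_C": (3.5, 5e-4), "GENE_D": (-3.0, 2.7e-3)}

labels = shared_genes(als, pm25, n_panel_genes=9130)
print(labels)
```

In the toy data, GENE_C is dropped because it is not significant for ALS and GENE_D because its direction of effect differs between ALS and the pollutant, which mirrors the "same direction" requirement in the text.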
To test how much of the GWAS signal and how many TWAS genes remain in a locus after the association of the significant TWAS genes is removed, we performed joint/conditional tests in FUSION using FUSION.post_process and FUSION.assoc_test.

| Protein interaction and network prediction
We used GeneMANIA (http://genemania.org/) to predict the protein-protein interactions (PPI) of the genes that were significant in the TWAS analysis [41]. Detailed information on the data sets included in GeneMANIA is described elsewhere [41].

| RESULTS
After strict filtering, 14, 20, and 19 SNPs were selected as the IVs for PM2.5, NO2, and NOx, respectively (Tables S2-S4). The F-statistics were all above 10, indicating that the IVs were strong instruments for the exposures.
Altogether, our analyses suggested a causal link between higher exposure to PM2.5 and NOx and an increased risk of ALS, whereas NO2 showed no statistically significant causal effect on ALS.
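As a quick consistency check, the p-values reported in the Abstract can be approximately recovered from the odds ratios and 95% confidence intervals. The sketch below assumes a symmetric Wald interval on the log-odds scale and a normal test statistic; the helper name `p_from_or_ci` is hypothetical and this is not how the original estimates were produced.

```python
import math


def p_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Recover log-OR, its SE, and a two-sided p-value from an OR and 95% CI.

    Assumes the interval is exp(beta +/- z_crit * se), i.e. a Wald interval.
    """
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    p = math.erfc(abs(beta / se) / math.sqrt(2))
    return beta, se, p


# Odds ratios and confidence intervals as reported in the Abstract.
reported = {
    "PM2.5": (2.40, 1.26, 4.57),
    "NOx": (2.35, 1.32, 4.17),
}
bonferroni_alpha = 0.05 / 3

for exposure, (or_, lo, hi) in reported.items():
    beta, se, p = p_from_or_ci(or_, lo, hi)
    print(exposure, round(beta, 3), round(se, 3), p, p < bonferroni_alpha)
```

The recomputed p-values should land close to the reported 7.46E-3 and 3.65E-3; small differences are expected because the published odds ratios and interval bounds are rounded to two decimals.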
In the sensitivity analyses, neither MR-PRESSO nor the MR-Egger intercept test showed significant pleiotropy in any TSMR analysis (p > 0.05) (Table 1). Cochran's Q test under both MR-Egger and IVW likewise showed no significant heterogeneity in any TSMR analysis (p > 0.05) (Table 1). Therefore, our selected IVs and TSMR results appeared robust.
To investigate the potential mechanism of air pollutant-induced ALS, we conducted a TWAS analysis. In total, eight genes were significantly associated with ALS in TWAS and exhibited the same direction of effect with PM2.5/NOx among the panels of 9130 genes (p < 5.48E-6, 0.05/9130) (Table 2 and Table S5). For PM2.5, USP35 and TMEM175 were significantly associated with the phenotype of higher exposure to PM2.5 (p < 6.25E-3, 0.05/8). The joint/conditional tests showed that USP35 and TMEM175 were independently and strongly associated with ALS and PM2.5 in the corresponding locus (Figure 3). After conditioning on these two genes, the GWAS signal dropped. These results suggested that air pollutants might induce ALS through pathways related to USP35 and TMEM175. We then performed PPI analysis to identify proteins potentially interacting with USP35 and TMEM175 (Figure 4).
Meanwhile, for NOx, USP35 and TMEM175 showed only suggestive associations in the TWAS analysis. Interestingly, C9orf72, an established risk gene for ALS, showed a significant association with ALS (p = 9.41E-33) and suggestive associations with both PM2.5 (p = 2.31E-2) and NOx (p = 2.61E-2), indicating that C9orf72 might also mediate part of the effect of air pollutants on ALS. ALS5 (SPG11), another causative gene for ALS, showed non-significant trends of association with exposure to PM2.5 (p = 0.14) and NOx (p = 0.26) (Table S5).

| DISCUSSION
We used TSMR to investigate potential causal links between air pollution (PM2.5, NOx, and NO2) and ALS. We found that higher exposure to PM2.5 and NOx might be causally associated with an increased risk of ALS. The relationship between NO2 and ALS showed a slight positive trend that was not statistically significant. We also found that USP35 and TMEM175 potentially play important roles in air pollutant-induced ALS.

PM2.5 refers to airborne particles with a diameter of 2.5 μm or less that can be inhaled [42]. PM2.5 was conventionally considered to cause many respiratory diseases after inhalation [43,44]. Recently, mounting evidence suggests that exposure to PM2.5 also impairs the central nervous system [45,46]. With a diameter of <2.5 μm, this fine particulate matter can reach the brain from the nasal cavity via two main pathways: directly penetrating the olfactory epithelium, or entering the circulation after traveling deep into the lungs and then traversing the blood-brain barrier (BBB) [47,48]. First, PM2.5 can cause mitochondrial dysfunction, traditionally considered one of the four major pathophysiological mechanisms of ALS, including elevated production of reactive oxygen species (ROS) and reduced mitochondrial membrane potential [49,50]. The former affects electron transfer in the electron transport chain, whereas the latter promotes oxidative stress, resulting in neuronal death and BBB dysfunction [48][52][53]. A meta-analysis of 26 studies reported a significant association between long-term PM2.5 exposure and stroke, dementia, Alzheimer's disease, autism spectrum disorder (ASD), and Parkinson's disease [54]. For the first time, we identified a significant causal association between exposure to PM2.5 and the occurrence of ALS, based on population-based genetic analyses.

FIGURE 2. Forest plots illustrating the two-sample Mendelian randomization (TSMR) estimates of the effects of air pollutants (PM2.5, NOx, and NO2) on the risk of amyotrophic lateral sclerosis (ALS). Each circle represents an individual instrumental variant (IV), with the corresponding odds ratio (OR) and 95% confidence interval (CI) indicated by the horizontal line. The p-value indicates the statistical significance of the association between exposure and ALS risk. IVW (inverse variance weighting) was used as the primary method, with additional methods (weighted median, MR-Egger, and Mendelian Randomization Pleiotropy RESidual Sum and Outlier [MR-PRESSO]) employed as supplementary analyses.
NOx is a group of gases primarily emitted from combustion processes, such as vehicle and industrial emissions, and is widely reported to have detrimental effects on human health [55]. Several observational studies have reported that long-term exposure to NOx is associated with a higher risk of ALS. However, causal inference and experimental investigation are still lacking, although NOx exposure is associated with numerous ALS-related pathways [56], such as oxidative stress [57] and neuronal death [58,59]. In this study, we report for the first time that NOx might be causally associated with ALS risk.

TABLE 2. TWAS results for shared genes between ALS and air pollutants.

FIGURE 3. Joint/conditional plots of TWAS. All the genes in the locus are shown in the top panel. The genes that show a marginal association in TWAS are marked in blue, while the genes that exhibit joint significance are highlighted in green. The lower panel displays a Manhattan plot illustrating the GWAS data before (gray) and after (blue) conditioning on the green genes. The GWAS signals dropped after conditioning on the predicted expression of TMEM175 (A, B) and USP35 (C, D). GWAS, genome-wide association study; TWAS, transcriptome-wide association.
Our investigation of the transcriptomic relationship between air pollutants and ALS revealed that TMEM175 and USP35 might mediate the effect of PM2.5 on ALS. TMEM175 is a lysosomal ion channel that assists in the digestion of abnormal proteins and in mitochondrial homeostasis. Dysfunction of TMEM175 has been correlated with multiple neurologic disorders [60]. For example, deficiency in TMEM175 could cause neuronal death, motor impairment, and Parkinson's disease [61]. In the brains of ALS patients, the level of TMEM175 was also reported to be abnormally decreased, which is consistent with our results [62]. Meanwhile, the processes downstream of TMEM175, including protein and mitochondrial homeostasis, are typical of the pathophysiology of ALS [56]. However, the biological effects of TMEM175 in ALS and how PM2.5 influences the function of TMEM175 need to be further explored in the laboratory.
USP35 is an enzyme of the deubiquitinase family, which removes ubiquitin molecules and regulates protein homeostasis. Deubiquitinases play a pivotal role in the pathogenesis of ALS: they regulate proteotoxicity and protein quality control, thus influencing the development of the disease. Inhibiting deubiquitinases could protect against proteotoxicity in ALS [63]. In addition, USP35 could regulate PARK2-mediated mitophagy and mitochondrial quality control [64]. Defects in mitochondrial function are related to multiple pathologic processes in ALS, including disturbed neuronal calcium homeostasis, autophagy, and axonal degeneration [56]. Additionally, the proteins we predicted to interact with USP35 have also been reported in ALS. For example, SMURF2 was reported to be immunopositive in ALS and to colocalize with TDP-43, a known causative protein of ALS [65].

C9orf72, widely regarded as the most common genetic cause of ALS [66], was significant in our ALS TWAS analysis. The mechanisms by which C9orf72 induces ALS have been well elucidated in previous studies [56,66]. Our TWAS analysis found that C9orf72 expression was suggestively associated with PM2.5 and NOx, although this association did not pass Bonferroni correction. Considering the importance of C9orf72 in ALS, this suggestive evidence should not be neglected. PM2.5 has been reported to alter DNA methylation through a mechanism that remains unclear [67,68]; methylation acts as a gene silencer that suppresses the expression of certain DNA regions, such as the C9orf72 expansion [69,70]. Thus, one potential epigenetic explanation is that PM2.5-related demethylation of the C9orf72 expansion induces the formation of RNA foci and the expression of dipeptide repeat (DPR) proteins. Further exploration is needed to elucidate whether air pollutants could influence the level or function of C9orf72 and the underlying mechanisms involved.
ALS5 (SPG11) is the major gene causing autosomal recessive ALS [71]. We found trends in the association of ALS5 with PM2.5 and NOx. Considering that air pollutants might affect the levels of multiple proteins [72], induce genetic mutations [73], and potentially cause ALS, future studies with larger sample sizes might shift this trend into significance.
Limitations of our study should be taken into consideration. First, our findings need more experimental validation. Second, the data sets in this study consisted of European populations, limiting the generalizability to other ethnic groups. Last, as a context-dependent and environment-related GWAS, the IVs of air pollutants might not be a perfect proxy for intrinsic measurement. The LD-score ratio of 53%-63% in the air pollutant GWAS (collected from the IEU database) can be interpreted as a certain proportion of the signals in these GWAS coming from potential confounders, likely population structure, rather than polygenic signals (Table S6). Three reasons underlie this potential limitation: (1) while the IVs of air pollution were statistically significant in the UK Biobank data set, their biological significance and transferability need to be further validated; (2) the measurement of air pollutant exposure in the GWAS is based on participants' home addresses and may be biased by address changes across the lifetime; (3) the air pollutant GWAS are likely confounded by imperfectly corrected latent population structure, which has previously been shown to affect GWAS in the UK Biobank in spite of stringent corrections [74] and is strongly correlated with participant location. These issues could restrict the relevance and independence assumptions of MR analysis to a certain degree [75].

To summarize, our study supports a causal relationship between exposure to air pollutants and ALS using MR analysis based on the largest and latest GWAS. Our findings suggest that PM2.5 and NOx exposure is associated with an increased incidence of ALS, while NO2 exposure did not have a statistically significant effect. Through transcriptome-wide association analysis, we identified that TMEM175 and USP35 might mediate the effect of PM2.5 on ALS, in relation to the homeostasis of the proteome and mitochondria.

Our study contributes to a growing body of evidence that environmental exposures, such as air pollution, may play a role in the development of ALS. These findings highlight the need for further research to validate our results and explore potential preventive measures for this devastating disease.

FIGURE 4. Protein-protein interaction plot for USP35 and TMEM175.

TABLE 1. Sensitivity analyses for two-sample Mendelian randomization.